Approximating a Deep-Syntactic Metric for MT Evaluation and Tuning
Authors
Abstract
Overlapping of each approximation with human judgments (Avg / Max / Min), Czech as a target language:

Approximation               Avg    Max    Min
BLEU                        0.279  0.490  0.029
cap-macro approx            0.338  0.469  0.086
approx+cap-micro and BLEU   0.340  0.676  0.086
cap-micro approx            0.354  0.734  0.086
cap-micro orig              0.406  0.741  0.086
cap-micro tagger            0.413  0.769  0.086
cap-micro approx-restr      0.413  0.769  0.086
cap-macro orig              0.423  0.800  0.143
cap-macro tagger            0.428  0.800  0.143
cap-macro approx-restr      0.608  0.800  0.400
Similar resources
CUNI Experiments for WMT17 Metrics Task
In this paper, we propose three different methods for automatic evaluation of machine translation (MT) quality. Two of the metrics are trainable on direct-assessment scores, and two of them use dependency structures. The trainable metric AutoDA, which uses deep-syntactic features, achieved better correlation with human judgments than, e.g., the chrF3 metric.
PORT: a Precision-Order-Recall MT Evaluation Metric for Tuning
Many machine translation (MT) evaluation metrics have been shown to correlate better with human judgment than BLEU. In principle, tuning on these metrics should yield better systems than tuning on BLEU. However, due to issues such as speed, requirements for linguistic resources, and optimization difficulty, they have not been widely adopted for tuning. This paper presents PORT, a new MT eval...
A Customizable MT Evaluation Metric for Assessing Adequacy Machine Translation Term Project
This project describes a customizable MT evaluation metric that provides system-dependent scores for the purposes of tuning an MT system. The features presented focus on assessing adequacy over fluency. Rather than simply examining features, this project frames the MT evaluation task as a classification question to determine whether a given sentence was produced by a human or a machine. Support Ve...
Tackling Sparse Data Issue in Machine Translation Evaluation
We illustrate and explain problems of n-gram-based machine translation (MT) metrics (e.g. BLEU) when applied to morphologically rich languages such as Czech. A novel metric, SemPOS, based on the deep-syntactic representation of the sentence, tackles the issue and retains the performance for translation to English as well.
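The core idea behind a SemPOS-style metric can be sketched as an overlap of (lemma, semantic part-of-speech) pairs between the reference and the candidate translation. This is a hedged, minimal illustration, not the authors' implementation: the pair inventory, the analyzer producing the pairs, and the example values below are all assumptions.

```python
# Minimal sketch of a SemPOS-style overlap score (illustrative only).
# In practice the (lemma, sempos) pairs come from a deep-syntactic
# analyzer; here they are supplied as toy data.
from collections import Counter

def sempos_overlap(candidate, reference):
    """Matched (lemma, sempos) mass divided by reference mass."""
    cand, ref = Counter(candidate), Counter(reference)
    matched = sum((cand & ref).values())  # clipped multiset intersection
    return matched / max(sum(ref.values()), 1)

# Hypothetical analyzer output for a reference and a candidate sentence.
ref = [("dog", "n.denot"), ("run", "v"), ("fast", "adv")]
cand = [("dog", "n.denot"), ("walk", "v"), ("fast", "adv")]
print(round(sempos_overlap(cand, ref), 3))  # → 0.667
```

Because lemmas abstract away from rich inflection, such a score is less sparse than surface n-gram matching on a language like Czech.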
A New Syntactic Metric for Evaluation of Machine Translation
Machine translation (MT) evaluation aims at measuring the quality of a candidate translation by comparing it with a reference translation. This comparison can be performed on multiple levels: lexical, syntactic or semantic. In this paper, we propose a new syntactic metric for MT evaluation based on the comparison of the dependency structures of the reference and the candidate translations. The ...
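A dependency-based comparison of the kind described above can be sketched as an F1 score over labelled dependency triples. This is a hedged illustration under assumed conventions (the `(dependent, relation, head)` triple format and the toy parses are mine, not the paper's):

```python
# Sketch: score a candidate translation by the overlap of labelled
# dependency triples between candidate and reference parses.
from collections import Counter

def dep_triple_f1(cand_triples, ref_triples):
    """F1 over multisets of (dependent, relation, head) triples."""
    cand, ref = Counter(cand_triples), Counter(ref_triples)
    matched = sum((cand & ref).values())  # clipped multiset intersection
    if matched == 0:
        return 0.0
    precision = matched / sum(cand.values())
    recall = matched / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Toy parses; real triples would come from a dependency parser.
ref = [("dog", "nsubj", "barked"), ("barked", "root", "ROOT"),
       ("loudly", "advmod", "barked")]
cand = [("dog", "nsubj", "barked"), ("barked", "root", "ROOT"),
        ("quietly", "advmod", "barked")]
print(round(dep_triple_f1(cand, ref), 3))  # → 0.667
```

Matching on triples rewards candidates that preserve grammatical relations even when word order differs from the reference.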